We’ll be using gganimate to do our animating. If you don’t already have gifski or av as an installed library, you’ll want to do that (these are what support the creation of GIF and movie files respectively.

Yet again, we’ll be using the gapminder dataset.

Animated Bubble Chart

Load the necessary libraries:

library(gapminder)
library(tidyverse)
library(gganimate)

Remember that gganimate is built on top of ggplot, so let’s first get a solid static visualization that we can use as our base. We’ll use a robust, but completely standard ggplot call:

  1. Call ggplot to use the gapminder data, with gdpPercap mapped to X axis, lifeExp to Y, pop to size, and continent to color
  2. Make it a scatterplot, since there are a lot of points, let’s give it some transparency by setting the alpha to 0.7, and let’s hide the legend.
  3. The data came with pre-set colors for each country, let’s use it.
  4. Let’s scale the size of the points to something reasonable, 2 on the low end and 12 at max.
  5. Put the X axis on a log scale.
  6. Facet by continent
  7. Set our labels.
p1 <- ggplot(gapminder, aes(gdpPercap, lifeExp, size = pop, color = country)) +
   geom_point(alpha = 0.7, show.legend = FALSE) +
   scale_color_manual(values = country_colors) +
   scale_size(range = c(2, 12)) +
   scale_x_log10() +
   facet_wrap(~continent) +
   theme_bw() +
   labs(title = "Year: 1952-2007", x = "GPD per capita", y = "Life Expectancy")

print(p1)

To turn it into an animation, we simply add a few functions:

  1. a new labs function overwrites the previous one, so we can dynamically display the changing years as the data points move across the plot.
  2. the transition_time function takes in the year variable as an input and it allows the animated plot to transition frame by frame as a function of the year variable.
  3. ease_aes function takes in linear as an input argument and it defines the transition of the frame to be in a linear fashion.
  4. since we saved this to p2, we now need to explicitly display it, so we call the animate function.
  5. the anim_save function allows the rendered animated plot to a .GIF file.
p2 <- p1 +
   labs(title = "Year: {frame_time}", x = "GDP per capita", y = "Life Expectancy") +
   transition_time(year) +
   ease_aes('linear')

animate(p2)

anim_save("gapminder1.gif")

Show preceding frames with gradual falloff

We can use shadow_wake() to draw a small wake after the data by showing the latest frames up to the current. You can choose to gradually diminish the size and/or opacity of the shadow. The length of the wake is not given in absolute frames, it is given as a proportion of the total length of the animation.

p3 <- p2 + 
   shadow_wake(wake_length = 0.3, alpha = FALSE)

animate(p3)

anim_save("gapminder2.gif")

Show the original data as trail

Alternatively we can use shadow_trail() to show the original data as a trail.

p4 <- p2 +
   shadow_trail()

animate(p4)

anim_save("gapminder3.gif")

Reveal data along a given dimension

We’ve created a standard line plot of lifeExp by country, faceted by continent.

p5 <- ggplot(gapminder %>% filter(continent == "Asia"), aes(year, lifeExp, color = country)) +
   geom_line(show.legend = FALSE)

p5

We can then call transition_reveal to let the data gradually appear, by year. The geom_point call means that as it appears it shows a point.

p6 <- p5 + 
   geom_point(show.legend = FALSE) +
   transition_reveal(year)

animate(p6)

anim_save("gapminder4.gif")

Morphing Bar Charts

Here we create a bar chart and then add an additional aesthetic called transition_states that provides a frame variable of year. For each value of the variable, a step on the chart will be drawn.

p7 <- gapminder %>% 
   group_by(year, continent) %>% 
   summarize(cont_pop = sum(pop)) %>% 
   ggplot(aes(continent, cont_pop, fill = continent)) +
   geom_bar(stat = "identity") +
   transition_states(year, transition_length = 2, state_length = 2) +
   ease_aes('sine-in-out') +
   labs(title = "Population in {closest_state}")

animate(p7)

anim_save("gapminder5.gif")

Barchart Race

First, we get the data prepped, which includes grouping by year, sort descending by population, assigning the rank, and then filtering to the top 10 for each year.

ranked_by_year <- gapminder %>% 
   select(country, pop, year, continent) %>% 
   group_by(year) %>% 
   arrange(year, -pop) %>% 
   mutate(rank = min_rank(-pop)) %>% 
   filter(rank <= 10)

ranked_by_year
## # A tibble: 120 × 5
## # Groups:   year [12]
##    country              pop  year continent  rank
##    <fct>              <int> <int> <fct>     <int>
##  1 China          556263527  1952 Asia          1
##  2 India          372000000  1952 Asia          2
##  3 United States  157553000  1952 Americas      3
##  4 Japan           86459025  1952 Asia          4
##  5 Indonesia       82052000  1952 Asia          5
##  6 Germany         69145952  1952 Europe        6
##  7 Brazil          56602560  1952 Americas      7
##  8 United Kingdom  50430000  1952 Europe        8
##  9 Italy           47666000  1952 Europe        9
## 10 Bangladesh      46886859  1952 Asia         10
## # … with 110 more rows

Then we create a static plot:

  • using geom_rect which needs the four corners of the rectangle.
  • Make the rectangles somewhat transparent.
  • Facet by year
  • Reverse the y scale
  • Change the limits on X so we can display labels better.
  • Call geom_text for the country labels.

This gives us our static faceted plot.

p8 <- ranked_by_year %>% 
   ggplot(aes(xmin = 0, xmax = pop / 1000000, 
              ymin = rank - .45, ymax = rank +.45, y = rank,
              fill = continent)) +
   geom_rect(alpha = .7) +
   facet_wrap(~ year) +
   scale_y_reverse() +
   scale_x_continuous(limits = c(-800, 1400)) +
   geom_text(x = -50, 
             hjust = "right", 
             col = "grey",
             aes(label = country)) +
   labs(x = "Population (millions)", 
        y = "") +
   theme_void()

print(p8)

Then we remove the facet, refine the X scale, add a numeric label text, set the group aesthetic to country and then set the transition_time to one year. Then we save as a GIF.

p8 +
   facet_null() +
   scale_x_continuous(limits = c(-355, 1400)) +
   geom_text(x = 1000, y = -9.5, 
             aes(label = as.character(year)),
             size = 30, col = "grey") +
   aes(group = country) +
   transition_time(year)

anim_save("bar_race1.gif")

Another Racing Bar Chart demo

We begin by reading in the data direct from Git, select only the variables and observations we need. Then clean up the data (changing vars to numeric, renaming them), including pivoting it into a longer dataset. Then we group by year, we rank it, we get some relative values, we format a display label, and then limit it just to the top 10 for any given year. Here’s what that data now looks like:

library(janitor)
gdp <- read_csv("https://raw.githubusercontent.com/amrrs/animated_bar_charts_in_R/master/data/GDP_Data.csv")

gdp <- gdp %>% select(3:15)
gdp <- gdp[1:217,]
gdp_tidy <- gdp %>% 
   mutate_at(vars(contains("YR")), as.numeric) %>% 
   pivot_longer(cols = 3:13, names_to = "year") %>% 
   mutate(year = as.numeric(str_sub(year, 1, 4))) %>% 
   clean_names()


gdp_formatted <- gdp_tidy %>% 
   group_by(year) %>% 
   mutate(rank = rank(-value),
          value_rel = value/value[rank==1],
          value_label = paste0(" ", round(value/1e9))) %>%
   filter(rank <= 10)

gdp_formatted
## # A tibble: 110 × 7
## # Groups:   year [11]
##    country_name country_code  year   value  rank value_rel value_label
##    <chr>        <chr>        <dbl>   <dbl> <dbl>     <dbl> <chr>      
##  1 Brazil       BRA           1990 4.62e11    10    0.0773 " 462"     
##  2 Brazil       BRA           2000 6.55e11    10    0.0637 " 655"     
##  3 Brazil       BRA           2009 1.67e12     8    0.116  " 1667"    
##  4 Brazil       BRA           2010 2.21e12     7    0.148  " 2209"    
##  5 Brazil       BRA           2011 2.62e12     7    0.169  " 2616"    
##  6 Brazil       BRA           2012 2.47e12     7    0.153  " 2465"    
##  7 Brazil       BRA           2013 2.47e12     7    0.148  " 2473"    
##  8 Brazil       BRA           2014 2.46e12     7    0.141  " 2456"    
##  9 Brazil       BRA           2015 1.80e12     9    0.0995 " 1802"    
## 10 Brazil       BRA           2016 1.79e12     9    0.0963 " 1794"    
## # … with 100 more rows

From there, we build the plot. A few notes:

  • This version uses geom_tile, which is basically the same as geom_rect, they just take different arguments.
  • We use geom_text for the Name label.
  • We use another geom_text for the value label.
  • Call scales to pretty up the Y scale display
  • Reverse the order of the X axis.
  • Then flip it all so it’s a horizontal bar
  • Finally we do a big old total theme cleanup
p9 <- ggplot(gdp_formatted, aes(rank, group = country_name, 
                                fill = as.factor(country_name),
                                color = as.factor(country_name))) +
   geom_tile(aes(y = value/2,
                 height = value,
                 width = 0.9), alpha = 0.8, color = NA) +
   geom_text(aes(y = 0, label = paste(country_name, " ")), vjust = 0.2, hjust = 1) +
   geom_text(aes(y = value, label = value_label, hjust = 0)) +
   scale_y_continuous(labels = scales::comma) +
   scale_x_reverse() +
   coord_flip(clip = "off", expand = FALSE) +
   theme(axis.line=element_blank(),
         axis.text.x=element_blank(),
         axis.text.y=element_blank(),
         axis.ticks=element_blank(),
         axis.title.x=element_blank(),
         axis.title.y=element_blank(),
         legend.position="none",
         panel.background=element_blank(),
         panel.border=element_blank(),
         panel.grid.major=element_blank(),
         panel.grid.minor=element_blank(),
         panel.grid.major.x = element_line( size=.1, color="grey" ),
         panel.grid.minor.x = element_line( size=.1, color="grey" ),
         plot.title=element_text(size=25, hjust=0.5, face="bold", colour="grey", vjust=-1),
         plot.subtitle=element_text(size=18, hjust=0.5, face="italic", color="grey"),
         plot.caption =element_text(size=8, hjust=0.5, face="italic", color="grey"),
         plot.background=element_blank(),
         plot.margin = margin(2,2, 2, 4, "cm")) 

print(p9)

Now it’s time to animate. In the code below, we set the transition state to cycle through year, take 4 times as long going to the next cut as we do pausing there. We fix the X axis, but allow Y to vary, which is the default behavior. We also set the title label to vary so that it captures the closest state (year). Finally we animate and then save it as a GIF.

p10 <- p9 +
   transition_states(year, transition_length = 4, state_length = 1) +
   view_follow(fixed_x = TRUE) +
   labs(title = "GPD per Year :  {closest_state}",
        subtitle = "Top 10 Countries",
        caption = "GDP in Billions USD | Data Source: World Bank Data")

animate(p10)

anim_save("bar_race2.gif")

Dynamic Maps with Leaflet

Leaflet is a powerful open-source JavaScript library for building interactive maps in HTML.

The architecture is very similar to ggplot2, but instead of putting data-based layers on top of a static map, leaflet allows you to put data-based layers on top of an interactive map.

A leaflet map widget is created with the leaflet() command. We then add layers to the widget. The first layer that we will add is a tile layer containing all of the static map information, which by default comes from OpenStreetMap. The second layer we will add here is a marker, which designates a point location. Notice how the addMarkers() function can take a data argument, just like a geom_*() layer in ggplot2 would.

white_house <- tibble(
   address = "The White House, Washington, DC"
) %>% 
   tidygeocoder::geocode(address, method = "osm")

library(leaflet)

white_house_map <- leaflet() %>% 
   addTiles() %>% 
   addMarkers(data = white_house)

white_house_map

You can scroll and zoom at will!

You can also add a pop-up to provide more information about a particular location. Notice how we only need to call the previously saved leaflet map and then add a Popup layer to it.

white_house <- white_house %>% 
   mutate(title = "The White House", 
          street_address = "1600 Pennsylvania Ave")

white_house_map %>% 
   addPopups(data = white_house, 
             popup = ~paste0("<b>", title, "</b></br>", street_address))

There are several different providers of tiles. Below we’ll demonstrate two others, and we’ll also see how we can set a specific view and zoom level.

# Background 1: NASA
leaflet() %>% 
   addTiles() %>% 
   setView(lng = 2.34, lat = 48.85, zoom = 5) %>% 
   addProviderTiles("NASAGIBS.ViirsEarthAtNight2012")
# Background 2: World Imagery
leaflet() %>% 
   addTiles() %>% 
   setView(lng = 2.34, lat = 48.85, zoom = 3) %>% 
   addProviderTiles("Esri.WorldImagery")

Choropleth Maps

We’ll be showing 2016 House election results in NC.

# install.packages("fec16")
library(fec16)

nc_results <- results_house %>% # built in fec16 data
   mutate(district = parse_number(district_id)) %>% 
   left_join(candidates, by = "cand_id") %>% # candidates is also built in fec16 data
   select(state, district, cand_name, party, general_votes) %>% 
   arrange(desc(general_votes)) %>% 
   filter(state == "NC") %>% 
   group_by(state, district) %>% 
   summarize(N = n(), 
             total_votes = sum(general_votes, na.rm = T),
             d_votes = sum(ifelse(party == "DEM", general_votes, 0), na.rm = T),
             r_votes = sum(ifelse(party == "REP", general_votes, 0), na.rm = T),
             other_votes = total_votes - d_votes - r_votes, 
             r_prop = r_votes / total_votes, 
             winner = ifelse(r_votes > d_votes, "Republican", "Democrat"))

nc_results
## # A tibble: 13 × 9
## # Groups:   state [1]
##    state district     N total_votes d_votes r_votes other_votes r_prop winner   
##    <chr>    <dbl> <int>       <dbl>   <dbl>   <dbl>       <dbl>  <dbl> <chr>    
##  1 NC           1     3      350699  240661  101567        8471  0.290 Democrat 
##  2 NC           2     8      390567  169082  221485           0  0.567 Republic…
##  3 NC           3     5      323701  106170  217531           0  0.672 Republic…
##  4 NC           4     3      409541  279380  130161           0  0.318 Democrat 
##  5 NC           5     5      355512  147887  207625           0  0.584 Republic…
##  6 NC           6     3      351150  143167  207983           0  0.592 Republic…
##  7 NC           7     2      347706  135905  211801           0  0.609 Republic…
##  8 NC           8     3      323045  133182  189863           0  0.588 Republic…
##  9 NC           9     4      332493  139041  193452           0  0.582 Republic…
## 10 NC          10     5      349744  128919  220825           0  0.631 Republic…
## 11 NC          11     3      359508  129103  230405           0  0.641 Republic…
## 12 NC          12    10      349300  234115  115185           0  0.330 Democrat 
## 13 NC          13    22      355492  156049  199443           0  0.561 Republic…

Now we need a congressional district shapefile for the 114th Congress. Remember that the USAboundaries package has CD files. We also need to load up the sf library so we can work with sf data.

library(sf)
library(USAboundaries)
nc_map <- us_congressional(resolution = "high", states = "NC")

ggplot(nc_map) +
   geom_sf()

We need to merge in the election data with the shape file. The simplest way to do that is to use an inner_join(). Here we merge the nc_shp polygons with the nc_results election data frame using the district as the key.

nc_merged <- nc_map %>% 
   mutate(district = str_remove(cd116fp, "^0+") %>% as.numeric) %>% 
   inner_join(nc_results, by = "district")

glimpse(nc_merged)
## Rows: 13
## Columns: 22
## $ statefp           <chr> "37", "37", "37", "37", "37", "37", "37", "37", "37"…
## $ cd116fp           <chr> "01", "06", "05", "13", "09", "07", "02", "11", "04"…
## $ affgeoid          <chr> "5001600US3701", "5001600US3706", "5001600US3705", "…
## $ geoid             <chr> "3701", "3706", "3705", "3713", "3709", "3707", "370…
## $ namelsad          <chr> "Congressional District 1", "Congressional District …
## $ lsad              <chr> "C2", "C2", "C2", "C2", "C2", "C2", "C2", "C2", "C2"…
## $ cdsessn           <chr> "116", "116", "116", "116", "116", "116", "116", "11…
## $ aland             <dbl> 15207152815, 10128871422, 10280081294, 4745301686, 1…
## $ awater            <dbl> 525752701, 209014034, 80701577, 105117478, 85773395,…
## $ state_name        <chr> "North Carolina", "North Carolina", "North Carolina"…
## $ state_abbr        <chr> "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC"…
## $ jurisdiction_type <chr> "state", "state", "state", "state", "state", "state"…
## $ district          <dbl> 1, 6, 5, 13, 9, 7, 2, 11, 4, 10, 8, 3, 12
## $ state             <chr> "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC"…
## $ N                 <int> 3, 3, 5, 22, 4, 2, 8, 3, 3, 5, 3, 5, 10
## $ total_votes       <dbl> 350699, 351150, 355512, 355492, 332493, 347706, 3905…
## $ d_votes           <dbl> 240661, 143167, 147887, 156049, 139041, 135905, 1690…
## $ r_votes           <dbl> 101567, 207983, 207625, 199443, 193452, 211801, 2214…
## $ other_votes       <dbl> 8471, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ r_prop            <dbl> 0.2896130, 0.5922910, 0.5840169, 0.5610337, 0.581822…
## $ winner            <chr> "Democrat", "Republican", "Republican", "Republican"…
## $ geometry          <MULTIPOLYGON [°]> MULTIPOLYGON (((-79.00854 3..., MULTIPOLYGON (((-80.…

We can use Leaflet. First we will define a color palette over the values [0,1] that ranges from red to blue. According to the documentation, colorNumeric():

Conveniently maps data values (numeric or factor/character) to colors according to a given palette, which can be provided in a variety of formats.

The domain parameter tells it the possible values that can be mapped.

pal <- colorNumeric(palette = "RdBu", domain = c(0,1))

To make the plot in Leaflet, we have to add the tiles, and then the polygons defined by the sf object nc_merged. Since we want red to be associated with the proportion of Republican votes, we will map ‘1-r_prop’ to color. Note that we also add popups with the actual proportions, so that if you click on the map, it will show the district number and the proportion of Republican votes.

leaflet_nc <- leaflet(nc_merged) %>% 
   addTiles() %>% 
   addPolygons(
      weight = 1, fillOpacity = 0.7,
      color = ~pal(1 - r_prop),
      popup = ~paste("District", district, "</br>", round(r_prop, 4))) %>% 
   setView(lng = -80, lat = 35, zoom = 7)

leaflet_nc

Plotly Interactive Graphics

ggplotly, built and maintained by plotly, allows you to convert any ggplot visualization into a plotly visualization using the ggplotly() function.

Below we create a standard static ggplot object.

library(plotly)

p11 <- gapminder %>% 
   mutate(logGDPpercap = log(gdpPercap)) %>% 
   ggplot(aes(lifeExp, logGDPpercap)) +
   stat_density2d(geom = 'polygon', aes(fill = ..level..))

print(p11)

All you need to do is pass it the ggplotly() function and it creates an interactive graphic.

p11 <- ggplotly(p11)
p11

You can also do direct plotly functions, skipping ggplot entirely. This is especially useful when they have a chart format that isn’t easily available in ggplot, such as a stock candlestick chart.

library(tidyquant)

prices <- tq_get("GOOGL")

prices %>%
   plot_ly(x = ~date,
           type = "candlestick",
           open = ~open,
           close = ~close,
           high = ~high,
           low = ~low, 
           split = ~symbol)

Acknowledgements

  • Thanks to Gina Reynolds for the racing bar charts demo.
  • Here’s another racing bar chart demo I adapted from amrrs and the associated article at R Bloggers.
  • The leaflet section was adapted from here.